Re-assessing the WMT2013 Human Evaluation with Professional Translators Trainees
Abstract
This paper presents experiments on the human ranking task performed during WMT2013. The goal of these experiments is to re-run the human evaluation task with translation studies students and to compare the results with the human rankings performed by the WMT development teams during WMT2013. More specifically, we test whether, and to what extent, we can reproduce the WMT2013 ranking task, and whether specialised knowledge from translation studies influences the results in terms of intra- and inter-annotator agreement as well as system ranking. We present two experiments on the English-German WMT2013 machine translation output. Analysis of the data follows the methods described in the official WMT2013 report. The results indicate higher inter- and intra-annotator agreement, fewer ties, and slight differences in ranking for the translation studies students compared with the WMT development teams.
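Agreement between annotators on pairwise rankings is commonly summarised with Cohen's kappa, which corrects the observed agreement for agreement expected by chance. The sketch below is a minimal illustration of the standard two-annotator form, not the procedure from the paper or the WMT2013 report: the function name `cohen_kappa`, the label encoding (`"<"`, `"="`, `">"` for one pairwise comparison of two systems), and the sample data are all assumptions made for the example, and the chance-agreement estimate used in the official report may differ.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels.

    Hypothetical helper for illustration; each label is assumed to encode
    one pairwise judgement ("<", "=" or ">") made on the same item.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)

    # Observed agreement: share of items on which both annotators agree.
    p_agree = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label if each
    # draws independently from their own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_chance == 1.0:
        # Both annotators used a single identical label; kappa is undefined.
        return float("nan")
    return (p_agree - p_chance) / (1.0 - p_chance)


# Made-up judgements from two annotators on the same four comparisons.
annotator_1 = ["<", "<", "=", ">"]
annotator_2 = ["<", ">", "=", ">"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Intra-annotator agreement can be computed with the same function by passing one annotator's first and repeated judgements on the same items.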
Similar resources
A Conceptual Pattern for Assessing and Monitoring the Educational Performance of Technical and Professional Colleges with Emphasis on Human Resources
This study was conducted in 1398 with the aim of identifying an appropriate pattern for assessing and monitoring the educational performance of technical and professional colleges, with emphasis on human resources. To this end, qualitative research using the descriptive-survey method was conducted with the views of 16 experts working in vocational schools, through interv...
Development of the Human Factors Skills for Healthcare Instrument: a valid and reliable tool for assessing interprofessional learning across healthcare practice settings
Background: A central feature of clinical simulation training is human factors skills, providing staff with the social and cognitive skills to cope with demanding clinical situations. Although these skills are critical to safe patient care, assessing their learning is challenging. This study aimed to develop, pilot and evaluate a valid and reliable structured instrument to assess human factors s...
Correlating Translation Product and Translation Process Data of Professional and Student Translators
The paper presents an exploratory study of the translation processes for 12 student and 12 professional translators. We relate properties of the translators’ process data (eye movements and keystrokes) with the quality of the produced translations, using BLEU scores and human evaluation scores for fluency and accuracy to assess translation quality. We also investigate how BLEU scores correlate ...
A tool for rapid manual translation
There have been several attempts to realize the idea of a fully automatic translation system for text translation to replace human translators. By contrast, little work has been put into building tools to aid human translators. This report describes the ideas behind such a tool. The tool is intended to aid human translators in achieving higher productivity and better quality, by presenting term...
A Dataset for Assessing Machine Translation Evaluation Metrics
We describe a dataset containing 16,000 translations produced by four machine translation systems and manually annotated for quality by professional translators. This dataset can be used in a range of tasks assessing machine translation evaluation metrics, from basic correlation analysis to training and test of machine learning-based metrics. By providing a standard dataset for such tasks, we h...